15 research outputs found
Multi-GPU aggregation-based AMG preconditioner for iterative linear solvers
We present and release in open source format a sparse linear solver which
efficiently exploits heterogeneous parallel computers. The solver can be easily
integrated into scientific applications that need to solve large and sparse
linear systems on modern parallel computers made of hybrid nodes hosting NVIDIA
Graphics Processing Unit (GPU) accelerators.
The work extends our previous efforts in the exploitation of a single GPU
accelerator and proposes an implementation, based on the hybrid MPI-CUDA
software environment, of a Krylov-type linear solver relying on an efficient
Algebraic MultiGrid (AMG) preconditioner already available in the BootCMatchG
library. Our design for the hybrid implementation has been driven by the best
practices for minimizing data communication overhead when multiple GPUs are
employed, yet preserving the efficiency of the single GPU kernels. Strong and
weak scalability results on well-known benchmark test cases of the new version
of the library are discussed. Comparisons with the NVIDIA AmgX solution show an
improvement of up to 2.0x in the solve phase.
Why diffusion-based preconditioning of Richards equation works: spectral analysis and computational experiments at very large scale
We consider here a cell-centered finite difference approximation of the
Richards equation in three dimensions, averaging for interface values the
hydraulic conductivity, a highly nonlinear function, by arithmetic,
upstream, and harmonic means. The nonlinearities in the equation can lead to
changes in soil conductivity over several orders of magnitude and
discretizations with respect to space variables often produce stiff systems of
differential equations. A fully implicit time discretization is provided by
the backward Euler one-step formula; the resulting nonlinear algebraic
system is solved by an inexact Newton Armijo-Goldstein algorithm, requiring the
solution of a sequence of linear systems involving Jacobian matrices. We prove
some new results concerning the distribution of the Jacobians' eigenvalues and
the explicit expression of their entries. Moreover, we explore some connections
between the saturation of the soil and the ill-conditioning of the Jacobians.
The information on eigenvalues justifies the effectiveness of some
preconditioning approaches that are widely used in the solution of the Richards
equation. We also propose a new software framework to experiment with scalable
and robust preconditioners suitable for efficient parallel simulations at very
large scales. Performance results on a literature test case show that our
framework is a promising step towards realistic simulations at
extreme scale.
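The three interface averages named in the abstract above can be illustrated with a minimal sketch. The function names, the two-cell setting, and the use of a total-head comparison to pick the upstream cell are assumptions made for illustration; they are not taken from the paper's software.

```python
# Hypothetical helpers: interface conductivity between two neighbouring cells.

def arithmetic_mean(k_left, k_right):
    """Arithmetic mean of the two cell conductivities."""
    return 0.5 * (k_left + k_right)


def harmonic_mean(k_left, k_right):
    """Harmonic mean; dominated by the smaller of the two conductivities."""
    total = k_left + k_right
    if total == 0.0:
        return 0.0
    return 2.0 * k_left * k_right / total


def upstream_value(k_left, k_right, head_left, head_right):
    """Conductivity of the cell the water flows from.

    Assumption: flow goes from the higher total head to the lower one.
    """
    return k_left if head_left >= head_right else k_right


if __name__ == "__main__":
    # Conductivities that differ by several orders of magnitude.
    k_wet, k_dry = 1.0e-2, 1.0e-6
    print("arithmetic:", arithmetic_mean(k_wet, k_dry))
    print("harmonic:  ", harmonic_mean(k_wet, k_dry))
    print("upstream:  ", upstream_value(k_wet, k_dry, 2.0, 1.0))
```

When the two conductivities differ by orders of magnitude, the arithmetic mean stays close to half of the larger value while the harmonic mean stays close to twice the smaller one, so the choice of average changes the entries of the resulting Jacobian considerably.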
A Novel Aggregation Method based on Graph Matching for Algebraic MultiGrid Preconditioning of Sparse Linear Systems
Session 11
Efficient algebraic multigrid preconditioners on clusters of GPUs
Many scientific applications require the solution of large and sparse linear systems of equations using Krylov subspace methods; in this case, the choice of an effective preconditioner may be crucial for the convergence of the Krylov solver. Algebraic MultiGrid (AMG) methods are widely used as preconditioners, because of their optimal computational cost and their algorithmic scalability. The wide availability of GPUs, now found in many of the fastest supercomputers, poses the problem of efficiently implementing these methods on high-throughput processors. In this work we focus on the application phase of AMG preconditioners, and in particular on the choice and implementation of smoothers and coarsest-level solvers capable of exploiting the computational power of clusters of GPUs. We consider block-Jacobi smoothers using sparse approximate inverses in the solve phase associated with the local blocks. The choice of approximate inverses instead of sparse matrix factorizations is driven by the large amount of parallelism exposed by the matrix-vector product as compared to the solution of large triangular systems on GPUs. The selected smoothers and solvers are implemented within the AMG preconditioning framework provided by the MLD2P4 library, using suitable sparse matrix data structures from the PSBLAS library. Their behaviour is illustrated in terms of execution speed and scalability, on a test case concerning groundwater modelling, provided by the Jülich Supercomputing Center within the Horizon 2020 Project EoCoE.
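The reason an approximate inverse suits GPUs can be seen in a minimal sketch of one smoother sweep, x <- x + M (b - A x), which uses only matrix-vector products and no triangular solves. Everything here is an assumption for illustration: the function names are hypothetical, a single dense local block stands in for the distributed sparse matrix, and M is simply the inverse of the diagonal, a stand-in for the sparse approximate inverses studied in the paper. This is not the MLD2P4 or PSBLAS API.

```python
# Hypothetical sketch: smoothing with an explicit approximate inverse M.

def matvec(A, x):
    """Dense matrix-vector product on nested lists."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def smooth(A, M, b, x, sweeps=1):
    """Apply x <- x + M (b - A x) a fixed number of times."""
    for _ in range(sweeps):
        r = [b_i - ax_i for b_i, ax_i in zip(b, matvec(A, x))]
        c = matvec(M, r)  # correction obtained by a product, not a solve
        x = [x_i + c_i for x_i, c_i in zip(x, c)]
    return x


def residual_norm(A, b, x):
    """Euclidean norm of the residual b - A x."""
    r = [b_i - ax_i for b_i, ax_i in zip(b, matvec(A, x))]
    return sum(r_i * r_i for r_i in r) ** 0.5


# Small diagonally dominant test block (assumed, not from the paper).
A = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
# Stand-in approximate inverse: the inverse of the diagonal of A.
M = [[0.25, 0.0, 0.0],
     [0.0, 0.25, 0.0],
     [0.0, 0.0, 0.25]]
b = [1.0, 2.0, 3.0]
x0 = [0.0, 0.0, 0.0]

if __name__ == "__main__":
    x = smooth(A, M, b, x0, sweeps=10)
    print("initial residual:", residual_norm(A, b, x0))
    print("final residual:  ", residual_norm(A, b, x))
```

Replacing M with a better sparse approximation of the inverse of the local block changes only the data, not the kernel: each sweep remains two matrix-vector products and a vector update.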
TEXTAROSSA: Towards EXtreme scale Technologies and Accelerators for euROhpc hw/Sw Supercomputing Applications for exascale
To achieve high performance and high energy efficiency on near-future exascale computing systems, three key technology gaps need to be bridged. These gaps include: energy efficiency and thermal control; extreme computation efficiency via HW acceleration and new arithmetics; methods and tools for seamless integration of reconfigurable accelerators in heterogeneous HPC multi-node platforms. TEXTAROSSA aims at tackling these gaps through a co-design approach to heterogeneous HPC solutions, supported by the integration and extension of HW and SW IPs, programming models and tools derived from European research.
Solution of Ambrosio-Tortorelli model for image segmentation by generalized relaxation method
Image segmentation addresses the problem of partitioning a given image into its constituent objects and then identifying the boundaries of the objects. This problem can be formulated in terms of a variational model aimed at finding optimal approximations of a bounded function by piecewise-smooth functions, minimizing a given functional. The corresponding Euler-Lagrange equations are a set of two coupled elliptic partial differential equations with varying coefficients. Numerical solution of the above system often relies on alternating minimization techniques involving descent methods coupled with explicit or semi-implicit finite-difference discretization schemes, which are slowly convergent and poorly scalable with respect to image size. In this work we focus on generalized relaxation methods, also coupled with multigrid linear solvers, when a finite-difference discretization is applied to the Euler-Lagrange equations of the Ambrosio-Tortorelli model. We show that nonlinear Gauss-Seidel, accelerated by inner linear iterations, is an effective method for large-scale image analysis such as that arising from high-throughput screening platforms for targeted stem cell differentiation, where one of the main goals is the segmentation of thousands of images to analyze cell colony morphology.
Reprint of Solution of Ambrosio-Tortorelli model for image segmentation by generalized relaxation method
Image segmentation addresses the problem of partitioning a given image into its constituent objects and then identifying the boundaries of the objects. This problem can be formulated in terms of a variational model aimed at finding optimal approximations of a bounded function by piecewise-smooth functions, minimizing a given functional. The corresponding Euler-Lagrange equations are a set of two coupled elliptic partial differential equations with varying coefficients. Numerical solution of the above system often relies on alternating minimization techniques involving descent methods coupled with explicit or semi-implicit finite-difference discretization schemes, which are slowly convergent and poorly scalable with respect to image size. In this work we focus on generalized relaxation methods, also coupled with multigrid linear solvers, when a finite-difference discretization is applied to the Euler-Lagrange equations of the Ambrosio-Tortorelli model. We show that nonlinear Gauss-Seidel, accelerated by inner linear iterations, is an effective method for large-scale image analysis such as that arising from high-throughput screening platforms for targeted stem cell differentiation, where one of the main goals is the segmentation of thousands of images to analyze cell colony morphology.